library(aplore3)
library(ggplot2)
library(plotly)
library(RColorBrewer)
library(pheatmap)
library(cluster)
Data Format: A data.frame with 500 rows and 18 variables such as:
priorfrac - If the patient previously had a fracture
age
weight
height
bmi
premeno
momfrac
armassist
smoke
raterisk
fracscore
fracture
bonemed - Bone medications at enrollment (1: No, 2: Yes)
bonemed_fu - Bone medications at follow-up (1: No, 2: Yes)
bonetreat - Bone medications both at enrollment and follow-up (1: No, 2: Yes)
head(glow_bonemed)
## sub_id site_id phy_id priorfrac age weight height bmi premeno momfrac
## 1 1 1 14 No 62 70.3 158 28.16055 No No
## 2 2 4 284 No 65 87.1 160 34.02344 No No
## 3 3 6 305 Yes 88 50.8 157 20.60936 No Yes
## 4 4 6 309 No 82 62.1 160 24.25781 No No
## 5 5 1 37 No 61 68.0 152 29.43213 No No
## 6 6 5 299 Yes 67 68.0 161 26.23356 No No
## armassist smoke raterisk fracscore fracture bonemed bonemed_fu bonetreat
## 1 No No Same 1 No No No No
## 2 No No Same 2 No No No No
## 3 Yes No Less 11 No No No No
## 4 No No Less 5 No No No No
## 5 No No Same 1 No No No No
## 6 No Yes Same 4 No No No No
summary(glow_bonemed)
## sub_id site_id phy_id priorfrac age
## Min. : 1.0 Min. :1.000 Min. : 1.00 No :374 Min. :55.00
## 1st Qu.:125.8 1st Qu.:2.000 1st Qu.: 57.75 Yes:126 1st Qu.:61.00
## Median :250.5 Median :3.000 Median :182.50 Median :67.00
## Mean :250.5 Mean :3.436 Mean :178.55 Mean :68.56
## 3rd Qu.:375.2 3rd Qu.:5.000 3rd Qu.:298.00 3rd Qu.:76.00
## Max. :500.0 Max. :6.000 Max. :325.00 Max. :90.00
## weight height bmi premeno momfrac armassist
## Min. : 39.90 Min. :134.0 Min. :14.88 No :403 No :435 No :312
## 1st Qu.: 59.90 1st Qu.:157.0 1st Qu.:23.27 Yes: 97 Yes: 65 Yes:188
## Median : 68.00 Median :161.5 Median :26.42
## Mean : 71.82 Mean :161.4 Mean :27.55
## 3rd Qu.: 81.30 3rd Qu.:165.0 3rd Qu.:30.79
## Max. :127.00 Max. :199.0 Max. :49.08
## smoke raterisk fracscore fracture bonemed bonemed_fu
## No :465 Less :167 Min. : 0.000 No :375 No :371 No :361
## Yes: 35 Same :186 1st Qu.: 2.000 Yes:125 Yes:129 Yes:139
## Greater:147 Median : 3.000
## Mean : 3.698
## 3rd Qu.: 5.000
## Max. :11.000
## bonetreat
## No :382
## Yes:118
##
##
##
##
Age vs Weight: As weight increases, the average age decreases
Age vs Height: Weak relationship; as height increases, age decreases slightly
Age vs BMI: As BMI increases, the average age decreases
Age vs fracscore: As age increases, the average fracscore increases
Weight vs Height: As height increases, the average weight increases
Weight vs BMI: As BMI increases, the average weight increases
Weight vs fracscore: As fracscore increases, the average weight decreases
Height vs BMI: As BMI increases, the average height and its variance stay roughly the same
Height vs fracscore: As fracscore increases, the average height stays the same, though the variance might decrease
BMI vs fracscore: As fracscore increases, the average BMI decreases
plot(glow_bonemed[, c(5:8, 14)])
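The visual impressions above can be checked numerically. The following is a minimal sketch, assuming that Pearson correlation is an adequate summary of these pairwise relationships. The names numericVars and correlationMatrix are introduced here for illustration only.

```r
library(aplore3)  # provides glow_bonemed

# The five continuous columns shown in the scatter-plot matrix (columns 5:8 and 14)
numericVars = c("age", "weight", "height", "bmi", "fracscore")

# Pairwise Pearson correlations, rounded for readability
correlationMatrix = round(cor(glow_bonemed[, numericVars]), 2)
correlationMatrix
```

A negative entry corresponds to a "decreases" observation in the list above, and an entry near zero to a "stays the same" observation.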
None of the following scatter plots shows strong grouping by the
fracture/no fracture categorical variable
ggplot(glow_bonemed, aes(x = age, y = bmi, color = fracture)) +
geom_jitter()+
labs(title = "BMI vs Age")
ggplot(glow_bonemed, aes(x = bmi, y = fracscore, color = fracture)) +
geom_jitter()+
labs(title = "Fracture Score vs BMI")
ggplot(glow_bonemed, aes(x = fracscore, y = age, color = fracture)) +
geom_jitter()+
labs(title = "Age vs Fracture Score")
ggplot(glow_bonemed, aes(x = weight, y = height, color = fracture)) +
geom_jitter()+
labs(title = "Height vs Weight")
Once again, there do not seem to be strong groupings by the fracture categorical variable
fracture3dplot = plot_ly(glow_bonemed,
x = ~age,
y = ~height,
z = ~bmi,
color = ~fracture,
colors = c('#0C4B8E', '#BF382A')) %>% add_markers()
fracture3dplot
The boxplot shows that the median fracscore (the center line of each box) seems to be slightly higher for smokers than for non-smokers
ggplot(glow_bonemed, aes(x = smoke, y = fracscore)) +
geom_boxplot()+
labs(title = "Fracture Score Summary Statistics for Smokers vs Non Smokers")
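Because a boxplot draws the median rather than the mean, it can help to tabulate both for each group. This is a small sketch using base R's aggregate. The name fracscoreBySmoke is introduced here for illustration.

```r
library(aplore3)  # provides glow_bonemed

# Mean and median fracscore within each smoking group
fracscoreBySmoke = aggregate(
  fracscore ~ smoke,
  data = glow_bonemed,
  FUN = function(x) c(mean = mean(x), median = median(x))
)
fracscoreBySmoke
```

Note that the summary above lists only 35 smokers against 465 non-smokers, so any difference between the groups should be read with caution.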
pheatmap(glow_bonemed[, c(5,8)], scale = "column", fontsize_row = 0.1, cluster_cols = FALSE, legend = TRUE, color = colorRampPalette(c("blue", "white", "red"), space = "rgb")(100))
pheatmap(glow_bonemed[, 5:8], scale = "column", fontsize_row = 0.1, cluster_cols = FALSE, legend = TRUE, color = colorRampPalette(c("blue", "white", "red"), space = "rgb")(100))
zScoreScale = scale(glow_bonemed[, 5:8])
zScoreDistance = dist(zScoreScale)
continuousVariableClustering = hclust(zScoreDistance, method = "complete")
plot(continuousVariableClustering)
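The dendrogram alone does not show whether the clusters line up with fracture status. One way to check, sketched below, is to cut the tree into groups with cutree and cross-tabulate them against fracture. The choice of k = 2 is an assumption made only to mirror the two fracture levels, and clusterLabels is a name introduced here.

```r
library(aplore3)  # provides glow_bonemed

# Rebuild the clustering from the standardized continuous columns (age, weight, height, bmi)
zScoreScale = scale(glow_bonemed[, 5:8])
continuousVariableClustering = hclust(dist(zScoreScale), method = "complete")

# Cut the tree into k groups; k = 2 is an illustrative assumption
clusterLabels = cutree(continuousVariableClustering, k = 2)

# Cross-tabulate cluster membership against fracture status
clusterByFracture = table(cluster = clusterLabels, fracture = glow_bonemed$fracture)
clusterByFracture
```

If the fracture proportions are similar in every row, the clusters carry little information about fracture, which would be consistent with the scatter plots above.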
Sources:
Hosmer, D.W., Lemeshow, S. and Sturdivant, R.X. (2013)
Applied Logistic Regression, 3rd ed., New York: Wiley
https://cran.r-project.org/web/packages/aplore3/aplore3.pdf#page=11&zoom=100,132,90